Social inequality has been a topic of interest for ages, and has attracted researchers across disciplines
to ponder over its origin, manifestation, characteristics, consequences, and finally, the question of
how to cope with it. It is manifested across different strata of human existence, and is quantified in
several ways. In this review we discuss the origins of social inequality, and the historical and commonly
used non-entropic measures such as the Lorenz curve, the Gini index and the recently introduced k index.

We also discuss some analytical tools that aid in understanding and characterizing them. Finally, 
we argue how statistical physics modeling helps in reproducing the results and interpreting them. 


I. INTRODUCTION 

Repeated social interactions produce spontaneous variations manifested as inequalities at various levels. The
availability of huge amounts of empirical data for a plethora of measures of human social interactions makes it
possible to uncover the patterns and look for the reasons behind socio-economic inequalities. Using tools of statistical
physics, researchers are bringing in knowledge and techniques from various other disciplines [1], e.g., statistics, applied
mathematics, information theory and computer science, to gain a better understanding of the precise nature (both
spatial and temporal) and the origin of the socio-economic inequalities that are prevalent in our society.

Socio-economic inequality is concerned with the existence of unequal opportunities and rewards for various
social positions or statuses in a society. Structured and recurrent patterns of unequal distributions of goods, wealth,
opportunities, and even rewards and punishments are its key features, and it is measured as inequality of conditions and
inequality of opportunities. The former refers to the unequal distribution of income, wealth and material goods,
while the latter refers to the unequal distribution of ‘life chances’ across individuals, as is reflected in education, health
status, treatment by the criminal justice system, etc. Socio-economic inequality is held responsible for conflict, war, 
crisis, oppression, criminal activity, political unrest and instability, and affects economic growth indirectly [3]. Usually, 
economic inequalities have been studied in the context of income and wealth [M3- However, it is also measured for 
a plethora of quantities, including energy consumption [lT| . The inequality in society (l2l - fla | is an issue of current 
focus and immediate global interest, bringing together researchers across several disciplines - economics and finance, 
sociology, demography, statistics along with theoretical physics (See e.g., Ref. (j. ITol Il7jb 

Socio-economic inequalities are quantified in numerous ways. The most popular measures are absolute, quoted
as a single number in terms of indices, e.g., the Gini [18], Theil [19] and Pietra [20] indices. The alternative
approach is relative, using probability distributions of various quantities, but most of the previously mentioned
indices can be computed once one has knowledge of the distributions. What is usually observed is that most
quantities display broad distributions, usually lognormals, power laws or their combinations. For example, the distribution
of income is usually an exponential followed by a power law [21].

In one of the popular methods of measuring inequality, one considers the Lorenz curve [22], which is a function
that represents the cumulative proportion X of ordered (from lowest to highest) individuals in terms of the cumulative
proportion of their sizes Y. Y can represent the income or wealth of individuals, but it can equally well be the citations, votes,
or population of articles, candidates, or cities, respectively. The Gini index (g) is defined as the ratio of
the area enclosed between the Lorenz curve and the equality line to the area below the equality line. If the area between
(i) the Lorenz curve and the equality line is represented as A, and (ii) that below the Lorenz curve as B (see Fig. 1),
the Gini index is g = A/(A + B). It is a useful measure to quantify socio-economic inequality. Ghosh et al. [23]
recently introduced the ‘k index’ (where ‘k’ symbolizes the extreme nature of social inequalities in Kolkata), which
is defined as the fraction k such that (1 - k) fraction of people or papers possess k fraction of income or citations,
respectively [24].

FIG. 1: Schematic representations for Lorenz curve, Gini index g and k index. The dashed line depicts the Lorenz curve and
the solid line represents perfect equality. The area enclosed between the equality line and the Lorenz curve is A and that below 
the Lorenz curve is B. The Gini index is given by g = A/(A + B). The k index is given by the abscissa of the intersection 
point of the Lorenz curve and Y = 1 — X (adapted from Ref. EH). 


When the probability distribution is described using an appropriate parametric function, one can derive these
inequality measures analytically as a function of those parameters. In fact, a body of empirical evidence has been
reported to show that the distributions can be put into a finite number of types. Most of them turn out to be a
mixture of two distinct parametric distributions with a single crossover point [24].

This review is organized as follows: Sec. II discusses the evolutionary view of socio-economic inequalities and Sec. III
discusses the basic and most popular quantities which are used for measuring inequalities. Next, in Sec. IV we discuss
the most realistic scenario, in which probability distributions have to be fitted to more than a single function, and how
to measure inequalities from them. Further, we discuss whether inequalities are natural, in the context of the recent works of
Piketty, in Sec. V, followed by a section on statistical physics modeling in Sec. VI. Finally, we summarize in Sec. VII.


II. EVOLUTIONARY VIEW OF SOCIO-ECONOMIC INEQUALITY 

The human species is known to have lived as hunter-gatherers for more than 90% of its existence. It is
widely believed that those early societies were egalitarian, as is still seen in the lifestyle of various tribes like the !Kung
people of the Kalahari [25]. Early hunter-gatherer societies are even believed to have championed sex equality [26].
Traditionally having few possessions, they were semi-nomadic in the sense that they moved periodically. Having
hardly mastered farming and living in small groups, the mere survival instinct drove them to overlook individual
interests. They shared what they had, be it food, weapons, property, or territory, so that all of their group members
were healthy and strong [27]. With the advent of agricultural societies, elaborate hierarchies were created, with
less stable leadership in the course of time. These evolved into clans or groups led mostly by family lines, which eventually
developed into kingdoms. In these complex scenarios, new strategies for hoarding surplus produce of agriculture or goods
were adopted by the chiefs or kings, predominantly for survival in times of need, and this led to the concentration of
wealth and power (see Ref. [28] for models with savings). Along with the advancement of technologies, intermediate
mechanisms helped in wealth multiplication. This completed the transition from egalitarianism to societies having
competition, and the inequality paved the way for the growth of chiefdoms, states and industrial empires [10].

III. BASICS AND GENERIC PROPERTIES OF INEQUALITY MEASURES


In this section, we formally introduce the measures to quantify the degree of social inequality, namely, Lorenz curve, 
Gini index and k index. 

The Lorenz curve shows the relationship between the cumulative distribution and the cumulative first moment of 
P(m): 


$$X(r) = \int_{m_0}^{r} P(m)\,dm, \qquad Y(r) = \frac{\int_{m_0}^{r} m P(m)\,dm}{\int_{m_0}^{\infty} m P(m)\,dm}. \tag{1}$$


The set (X(r), Y(r)) defines the Lorenz curve, assuming P(m) to be defined in $[m_0, \infty)$. Fig. 1 shows the typical
behavior of the Lorenz curve. The Lorenz curve gives the cumulative proportion X of ordered individuals (from lowest to
highest) holding the cumulative proportion Y of wealth. The Lorenz curve, Gini index etc. were historically introduced in
the context of income/wealth. So, let us call X ‘individuals’ and Y ‘wealth’, but in principle the attributes X and
Y can be any of the combinations like article/citations, candidate/vote, city/population, student/marks, company/
employee etc. Hence, when all individuals take the same amount of wealth, say $m'$, we have $P(m) = \delta(m - m')$, with
$m_0 < m' < \infty$, and one obtains


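$$X(r) = \int_{m_0}^{r} \delta(m - m')\,dm = \Theta(r - m'), \qquad Y(r) = \frac{\int_{m_0}^{r} m\,\delta(m - m')\,dm}{\int_{m_0}^{\infty} m\,\delta(m - m')\,dm} = \frac{m'\,\Theta(r - m')}{m'} = \Theta(r - m'), \tag{2}$$

where $\Theta(x)$ is a step function defined by

$$\Theta(x) = \begin{cases} 1, & x \geq 0, \\ 0, & x < 0. \end{cases} \tag{3}$$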


Thus Y = X is the ‘perfect equality line’ (see Fig. 1), where X fraction of people takes X fraction of wealth in
the society. On the other extreme, if the total wealth in the society of N persons is concentrated in a few persons,
$P(m) = (1 - \epsilon)\,\delta_{m,0} + \epsilon\,\delta_{m,1}$, with $\epsilon \sim O(1/N)$ and with the total amount of wealth normalized to 1, we get
$X(r) = 1 - \epsilon + \epsilon\,\delta_{r,1}$ and $Y(r) = \delta_{r,1}$. Hence, Y = 1 iff X = r = 1 and Y = 0 otherwise, and the Lorenz curve is given
by the ‘perfect inequality line’ $Y = \delta_{X,1}$, where $\delta_{x,y}$ is the Kronecker delta (see Fig. 1).

For a given Lorenz curve, the Gini index is defined as twice the area between the curve (X(r), Y(r)) and the perfect
equality line Y = X (shaded part ‘A’ in Fig. 1). It reads

$$g = 2\int_{0}^{1} (X - Y)\,dX = 2\int_{r_0}^{\infty} \bigl(X(r) - Y(r)\bigr)\frac{dX(r)}{dr}\,dr, \tag{4}$$

where $X^{-1}(0) = r_0$ and $X^{-1}(1) = \infty$ should hold. Graphically (see Fig. 1), the Gini index is given by the two areas
(‘A’ and ‘B’) as g = A/(A + B). Thus, the Gini index g is zero for perfect equality and unity for perfect inequality. The
Gini index may be evaluated analytically when the distribution of the population is given in a parametric form.
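
As an illustration of such an analytic evaluation, consider a worked example that is not part of the original discussion: assume the exponential form $P(m) = e^{-m}$ with $m_0 = r_0 = 0$ (so that the mean wealth is 1). Under this assumption the definitions of X(r), Y(r) and g given above yield:

```latex
\begin{align}
X(r) &= \int_{0}^{r} e^{-m}\,dm = 1 - e^{-r}, \\
Y(r) &= \frac{\int_{0}^{r} m\,e^{-m}\,dm}{\int_{0}^{\infty} m\,e^{-m}\,dm} = 1 - (1 + r)\,e^{-r}, \\
g &= 2\int_{0}^{\infty} \bigl(X(r) - Y(r)\bigr)\frac{dX(r)}{dr}\,dr
   = 2\int_{0}^{\infty} r\,e^{-2r}\,dr = \frac{1}{2}.
\end{align}
```

Eliminating r gives the Lorenz curve in closed form for this assumed distribution, $Y = X + (1 - X)\ln(1 - X)$.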

The recently introduced k index is the value of the X-coordinate at the intersection between the Lorenz curve and the straight
line Y = 1 - X. For the solution of the equation X(r) + Y(r) = 1, say $r^* = Z^{-1}(1)$ with Z(r) = X(r) + Y(r), the k index
is given by $k = X(r^*)$. Thus, the value of the k index indicates that the poorest k fraction of people shares in total a (1 - k) fraction
of the wealth. Hence, the k index equals 1/2 for a perfectly equal society, and 1 for a perfectly unequal society. It is
obviously easier to estimate by eye than the Gini index (shaded area A in Fig. 1).

Apart from these indices, Pietra’s p index [20] and the m or median index [dfl have been used as inequality measures,
and can be derived from the Lorenz curve. The p index is defined as the maximal vertical distance between the Lorenz
curve and the line of perfect equality Y = X, while the m index is given by 2m - 1 for the solution of Y(m) = 1/2.
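
The definitions above can be sketched numerically for a finite sample of individual ‘wealth’ values. The following is a minimal illustration, not the procedure of any cited reference: the function names are hypothetical, the Lorenz curve is taken to be piecewise linear between sample points, and the k index is located by linear interpolation of Z = X + Y.

```python
import numpy as np

def lorenz_curve(values):
    """Return (X, Y): cumulative shares of individuals and of wealth, poorest first."""
    m = np.sort(np.asarray(values, dtype=float))
    n = m.size
    X = np.arange(n + 1) / n                              # cumulative fraction of individuals
    Y = np.concatenate(([0.0], np.cumsum(m))) / m.sum()   # cumulative fraction of wealth
    return X, Y

def gini_index(X, Y):
    """g = 2 * (area between Y = X and the curve) = 1 - 2 * (area under the curve)."""
    area_under = np.sum(0.5 * (Y[1:] + Y[:-1]) * np.diff(X))  # trapezoidal rule
    return float(1.0 - 2.0 * area_under)

def k_index(X, Y):
    """Abscissa where the Lorenz curve meets Y = 1 - X, i.e. where Z = X + Y = 1."""
    Z = X + Y  # strictly increasing from 0 to 2, so it can be inverted by interpolation
    return float(np.interp(1.0, Z, X))

def pietra_index(X, Y):
    """Maximal vertical distance between the equality line and the Lorenz curve."""
    return float(np.max(X - Y))

# Two limiting cases with four individuals (illustrative data only).
X_eq, Y_eq = lorenz_curve([1, 1, 1, 1])      # everyone holds the same wealth
X_un, Y_un = lorenz_curve([0, 0, 0, 1])      # one individual holds everything
```

For the equal sample the sketch gives g = 0 and k = 1/2; for the four-person unequal sample it gives g = 0.75 and k = 0.8, which approach the limiting values 1 and 1 only as the number of individuals grows.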


IV. RESULTS FOR MIXTURE OF DISTRIBUTIONS 

It is rather easy to perform analytic calculations for the g and k indices when the distribution of the population is
described by parametric distributions such as uniform, power law and lognormal distributions. It is quite common
to find that the probability distributions of quantities like wealth, income, votes, citations etc. fit more than
one theoretical function depending on the range. Formally, $P(m) = F_1(m)\,\theta(m, m_\times) + F_2(m)\,\Theta(m - m_\times)$, with
$\theta(m, m_\times) = \Theta(m) - \Theta(m - m_\times)$, where $m_\times$ is the crossover point. The functions $F_1(m)$ and $F_2(m)$ are suitably
normalized and matched for continuity at $m_\times$. In such cases, one can also develop a framework [24], using
which it is reasonably straightforward to calculate the Lorenz curve, Gini index and k index.
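
A numerical counterpart of this two-function form can be sketched as follows. This is only an illustration under stated assumptions, not the analytic framework of Ref. [24]: the density is integrated on a grid with the trapezoidal rule, the bulk is taken as exp(-b m) and the tail as c m^(-d) with c fixed by continuity at the crossover, and the crossover value `m_x = 4.0` and the grid limits are hypothetical choices. All function names are introduced here for illustration.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y over x, starting from 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

def indices_from_pdf(pdf, grid):
    """Estimate (g, k) for a (possibly unnormalized) density sampled on `grid`."""
    p = pdf(grid)
    X = cumulative_trapezoid(p, grid)          # cumulative population
    Y = cumulative_trapezoid(grid * p, grid)   # cumulative first moment
    X, Y = X / X[-1], Y / Y[-1]                # normalize both to end at 1
    g = 1.0 - 2.0 * cumulative_trapezoid(Y, X)[-1]
    k = float(np.interp(1.0, X + Y, X))        # solve X + Y = 1 for X
    return float(g), k

def crossover_pdf(b, d, m_x):
    """exp(-b m) below m_x and c m^(-d) above, continuous at the crossover m_x."""
    c = np.exp(-b * m_x) * m_x ** d
    def pdf(m):
        m = np.asarray(m, dtype=float)
        tail = c * np.maximum(m, m_x) ** (-d)  # clamp avoids 0 ** (-d) below m_x
        return np.where(m < m_x, np.exp(-b * m), tail)
    return pdf

m_x = 4.0  # hypothetical crossover point, chosen only for illustration
grid_mix = np.concatenate((np.linspace(0.0, m_x, 20001),
                           np.geomspace(m_x, 1e6, 20001)[1:]))
g_exp, k_exp = indices_from_pdf(lambda m: np.exp(-m), np.linspace(0.0, 50.0, 200001))
g_mix, k_mix = indices_from_pdf(crossover_pdf(1.3, 2.52, m_x), grid_mix)
```

The pure exponential serves as a check (g close to 1/2), while attaching a power law tail at the assumed crossover raises the Gini index, since a small fraction of the population then holds a much larger share of the total.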

FIG. 2: Left: Income distribution p(w) for the USA [31] for a few years; the data are rescaled by the average income $\langle w \rangle$. The lower
income range fits an exponential form $a\exp(-bw)$ with b = 1.3, while the high income range fits a power law decay $c\,w^{-d}$
with exponent d = 2.52. Right: Same for Denmark [32]. The middle and upper ranges fit different power laws, $c\,w^{-d}$ with
d = 2.96 and $c_1 w^{-d_1}$ with $d_1$ = 4.41. Taken from Ref. [24].


A. Empirical data 

The most well studied data in the context of socio-economic inequality are those of income. Incomes are re-calculated
from the income tax data reported by the Internal Revenue Service (IRS) [31] of the USA for 1996-2011. These data are used
to compute the probability distribution P(w) of income w for each year. Similar data from Denmark were used for
the years 2000-2012 [32]. The g index is found to be around 0.54-0.60 for the USA and 0.34-0.38 for Denmark, while
the k index is around 0.69-0.71 for the USA and 0.65-0.69 for Denmark [24]. Fig. 2 shows both these data sets.

Data related to several other socio-economic measures were also analyzed: (i) voting data of proportional elections
from a few countries [33], (ii) citations of different science journals and institutions, collected from ISI Web of
Science [34], and (iii) population data for city sizes for Brazil [35] and municipalities of Spain [36] and Japan [37].
The data showed broad distributions of the above quantities, which were well fitted by (see Ref. [24] for details)
(a) a single lognormal, (b) a single lognormal with a power law tail, (c) uniform with a power law tail, (d) uniform
with a lognormal tail, (e) a mixture of power laws, (f) a single power law with a lognormal tail.

The inequality indices computed using a combination of fitting functions were compared with that computed directly 
from the empirical data. 


V. IS WEALTH AND INCOME INEQUALITY NATURAL? 

One is often left wondering whether inequalities in wealth and income are natural. It has been shown using
models [38] and their dynamics that certain minimal dynamics over a completely random exchange process and
subsequent entropy maximization produce broad distributions. Piketty [39] argued recently that inequality in wealth
distribution is quite natural. He pointed out that before the great wars of the early 20th century, the strong skewness
of wealth distribution prevailed as a result of certain ‘natural’ mechanisms. The two World Wars and the Great
Depression helped in the dispersion of wealth. This in turn brought the prevailing extreme
inequality under check and subsequently gave rise to a sizable middle class. After analyzing very accurate data,
he concluded that the world is currently ‘recovering’ back to this ‘natural state’, which is happening due to capital
ownership driven growth of finance [12], which has been dominant over a labor economy, and this is simply the result of
the type of institutions and policies that are adopted by the society. His work raises quite fundamental issues concerning
not only economic theory but also the future of capitalism. It points out the large increases in the wealth/output
ratio. According to standard economic theory, such increases are attributed to the decrease in the return to capital 
and an increase in the wages. However, the return to capital has not diminished, while the wages have. He has also 
prescribed the following: higher capital-gains and inheritance taxes, higher spending towards access to education,
tight enforcement of anti-trust laws, corporate-governance reforms that restrict pay for executives, and finally,
financial regulations that curb the instruments by which banks exploit society. It is anticipated that all of
these might be able to help reduce inequality and increase equality of opportunity. There is further speculation that 
this might be able to restore the shared and quick economic growth that characterized the middle-class societies in
the mid-twentieth century. 


VI. STATISTICAL PHYSICS MODELING 

One of the most efficient ways to model the evolution of systems towards broad distributions showing strong inequality is by
using the toolbox of statistical physics. Microscopic and macroscopic modeling helps in imitating real socio-economic
systems.

There is a whole body of empirical evidence supporting the fact that a number of social phenomena are characterized
by emergent behavior arising out of the interactions of many individual social components. Recently, a growing community
of researchers has analyzed large-scale social dynamics to uncover certain ‘universal patterns’. There has also been
an attempt to propose simple microscopic models to describe them, in the same spirit as the minimalistic models 
commonly used in statistical physics [1|| U3 • 


A. Income & wealth distribution

In the case of wealth distributions, the popular models are chemical kinetics motivated Lotka-Volterra models [40-42],
polymer physics inspired models [43] and, most importantly, models inspired by the kinetic theory of gases [28], [44]-[46]
(see Ref. [9] for details). The two-class structure [21] of the income distribution (exponential-dominated low incomes
and a power law tail in the high incomes) is well understood to be a result of the very different dynamics of the two classes.
The bulk is described by a process which is more of a random kinetic exchange fill . |45[ , producing a distribution 
dominated by an exponential functional form. The dynamics is very simple, as described in the kinetic theory of 
gases [47]. The minimal modifications that one can introduce are additive or multiplicative terms. 

Processes creating inequality involving uniform retention rates [48] or, equivalently, savings [45] produce Gamma-like
distributions. These models are defined as a microcanonical ensemble, with a fixed number of agents and fixed total wealth. Here,
each wealth-exchanging agent retains a certain fraction (termed the ‘saving propensity’) of what they had before each
trading process and randomly exchanges the rest of the wealth. When all agents are assigned the same value of the
‘saving propensity’ (as in Ref. 0]), the model cannot produce a broad distribution of wealth. What is important to note
here is that the richest follow a different dynamic from the poor, and thus heterogeneity in the saving behavior plays
a crucial role. So, to obtain the power law distribution of wealth for the richest, one simply needs to consider that
each agent differs in the fraction of wealth they save in each trade [46], which is a very
natural ingredient to assume, because it is quite likely that agents in a trading market think very differently from one
another. In fact, with this small modification, one can explain the entire range of the wealth distribution [28]. These
models, moreover, can show interesting characteristics if the exchange processes and flows are made asymmetric, e.g.,
put on directed networks [49]. A plethora of variants of these models, results and analyses find possible applications
in a variety of trading processes [9].
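
The exchange rule described above can be sketched in a few lines. This is a minimal illustration rather than the implementation of any cited work: agents are paired at random in synchronous sweeps (an assumption made for speed; one random pair at a time is the other common choice), each agent keeps a fraction `saving` of its wealth, and the pooled remainder is split by a uniform random number. The function names, parameter values and the sample Gini formula are all choices made here.

```python
import numpy as np

def kinetic_exchange(n_agents=1000, sweeps=500, saving=0.0, seed=0):
    """Simulate wealth exchange with saving propensity (scalar or one value per agent)."""
    rng = np.random.default_rng(seed)
    lam = np.broadcast_to(np.asarray(saving, dtype=float), (n_agents,))
    m = np.ones(n_agents)                      # everyone starts with unit wealth
    half = n_agents // 2
    for _ in range(sweeps):
        perm = rng.permutation(n_agents)       # random disjoint pairs (i, j)
        i, j = perm[:half], perm[half:2 * half]
        eps = rng.random(half)                 # random share of the traded pool
        pool = (1.0 - lam[i]) * m[i] + (1.0 - lam[j]) * m[j]
        m_i_new = lam[i] * m[i] + eps * pool
        m_j_new = lam[j] * m[j] + (1.0 - eps) * pool
        m[i], m[j] = m_i_new, m_j_new          # each pair conserves its total wealth
    return m

def sample_gini(m):
    """Gini index of a sample via the sorted-rank formula."""
    m = np.sort(m)
    n = m.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * m) / (n * m.sum()) - (n + 1) / n)

wealth_no_saving = kinetic_exchange(saving=0.0)   # purely random exchange
wealth_saving = kinetic_exchange(saving=0.5)      # same saving propensity for all
g_no_saving = sample_gini(wealth_no_saving)
g_saving = sample_gini(wealth_saving)
```

Passing an array of per-agent values as `saving` (for example, numbers drawn uniformly from [0, 1)) gives the heterogeneous variant discussed in the text; agents with saving propensity close to 1 relax slowly, so many more sweeps may be needed before a tail is visible.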


B. Cities & firms 

City [50] and firm sizes [51] consistently exhibit broad distributions with power law tails for the largest sizes,
commonly known as Zipf’s law.

Gabaix [52] showed that if cities grow randomly at the same expected growth rate and with the same variance (Gibrat’s
law [53]), the limiting distribution converges to Zipf’s law. He proposed that growth ‘shocks’ are random and
that they impact utilities in both positive and negative ways. A similar approach resulted in diffusion and multiplicative
processes [54]. Shocks were also used to imitate sudden migration [55]. Simple economic arguments demonstrated
that expected urban growth rates were identical irrespective of city size and that variations were random normal deviates,
resulting in Zipf’s law with exponent unity.
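
A random-growth process of this kind can be sketched as below. This is a hedged illustration, not a reproduction of the cited analyses: every city receives an independent lognormal growth shock with the same parameters, sizes are rescaled to unit mean at each step, and a lower bound `s_min` on city size is imposed as an explicit assumption to keep the distribution stationary. The tail exponent is estimated with a Hill estimator; the parameter values and function names are hypothetical, and the estimate depends on `s_min`, the number of steps and the sample size.

```python
import numpy as np

def gibrat_growth(n_cities=5000, steps=3000, sigma=0.1, s_min=0.2, seed=0):
    """Multiplicative random growth with identical shock statistics for all cities."""
    rng = np.random.default_rng(seed)
    s = np.ones(n_cities)
    for _ in range(steps):
        s *= rng.lognormal(mean=0.0, sigma=sigma, size=n_cities)  # size-independent shocks
        s /= s.mean()                      # measure sizes relative to the mean
        np.maximum(s, s_min, out=s)        # assumed lower bound on city size
    return s

def hill_exponent(s, tail_fraction=0.05):
    """Hill estimate of the exponent of the cumulative distribution's upper tail."""
    x = np.sort(s)[::-1]                   # descending order
    n_tail = max(int(tail_fraction * x.size), 2)
    threshold = x[n_tail]                  # first value outside the tail sample
    return float(n_tail / np.sum(np.log(x[:n_tail] / threshold)))

sizes = gibrat_growth(n_cities=2000, steps=500)
zeta = hill_exponent(sizes)                # tail exponent estimate for this run
```

No particular value of `zeta` is claimed here; the sketch only shows how identical growth statistics plus a lower bound generate a heavy upper tail whose exponent can then be compared with Zipf’s law.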


C. Consensus 

Consensus in social systems is an interesting topic, due to its dynamics. The dynamics of agreement and disagreement
in a ‘society’ is complex, and statistical physicists working on opinion dynamics have been brave enough to
model opinion states in a population and the dynamics that determine the transitions between such states. A huge
body of old and recent literature EM3 discusses models that explain various social phenomena and the observed
inequality in such instances of consensus formation.


D. Bibliometrics 

The increasing amount of data produced from bibliometric tools has led to a better understanding of how researchers
and their publications ‘interact’ with one another in a ‘social system’ consisting of articles and researchers.
The patterns of citation distribution and growth are now well studied, and some of the most successful models have
used statistical physics [56].

Statistical physics tools have aided in formulating these microscopic models, which are simple enough yet rich in 
terms of socio-economic ingredients. Toy models help in understanding the basic mechanism at play, and demonstrate 
the crucial elements that are responsible for the emergent distributions of income and wealth. A variety of models, 
ranging from zero-intelligence variants to the more complex agent based models (including those incorporating game 
theory) have been proposed over the years and are found to be successful in interpreting the empirical results [9]. Simple
modeling is also effective in understanding how entropy maximization produces distributions which are dominated by
exponentials, and also explaining the reasons for aggregation at the high range of wealth, including the power law 
Pareto tail [§[, Q . 


VII. SUMMARY AND DISCUSSIONS 

Social inequalities are manifested in several forms, and are well recorded in history, being the reason for unrest, crises,
wars and revolutions. Although this is traditionally a subject of study of the social sciences, scholars from different fields have
been investigating the causes and effects from a sociological perspective, and trying to understand their consequences for
the prevailing economic system. Reality is not so simple, and pointing out the causes and the effects is much more
complex.

Imagine a world which is very equal: it would be difficult to compare the extremes or to differentiate
the good from the bad, there would hardly be any leadership for people to look up to, and stable governments would be lacking if there were
an almost equal number of political competitors, etc.

The recent concern about the increase of inequality in income and wealth, as pointed out from different measurements [13],
has renewed the interest in this topic among the leading social scientists across the globe. Society always
had classes, and climbing up and down the social ladder [57], [58] was quite difficult to track until recent surveys, which
provide some insight into the dynamics. Several deeper and important issues of our society Q still need attention in 
terms of inequality research, and this can only be achieved by uncovering hidden patterns on further analysis of the 
available data. 

Measurement of inequalities in society can be as simple as measuring zeroth order quantities such as the Gini
index, or as involved as finding exact probability distributions. The complexity of the underlying problems has inspired researchers
to propose multi-dimensional inequality indices [59], which serve well in explaining a lot of factors in a compact form.

As physicists, our interests are mostly concentrated on subjects which are amenable to modeling using macroscopic
or microscopic frameworks. Tools of statistical physics can very well explain the emergence of broad distributions
which are signatures of inequalities. The literature already developed contains serious attempts to understand socio-economic
phenomena, under Econophysics and Sociophysics [60]. The physics perspective brings alternative ideas
and a fresh outlook compared to the traditional approach taken by social scientists, and is reflected in the increasing
collaborations between researchers across disciplines [1].